4 research outputs found

    Isolate First, Then Share: a New OS Architecture for Datacenter Computing

    This paper presents the "isolate first, then share" OS model, in which processor cores, memory, and devices are partitioned among disparate OS instances, and proposes a new abstraction, subOS, to encapsulate an OS instance that can be created, destroyed, and resized on the fly. The intuition is that this avoids shared kernel state between applications, which in turn reduces the performance loss caused by contention. We decompose the OS into a supervisor and several subOSes running at the same privilege level: a subOS directly manages physical resources, while the supervisor can create, destroy, and resize a subOS on the fly. The supervisor and subOSes share little state, but fast inter-subOS communication mechanisms are provided on demand. We present the first implementation, RainForest, which supports unmodified Linux binaries. Our comprehensive evaluation shows that RainForest outperforms Linux with four different kernels, LXC, and Xen in terms of worst-case and average performance most of the time when running a large number of benchmarks. The source code will be available soon.
    Comment: 14 pages, 13 figures, 5 tables
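    The lifecycle the abstract describes (a supervisor that only creates, destroys, and resizes subOSes, each of which owns its slice of physical resources outright) can be summarized in a short sketch. The Python model below is purely illustrative: the class and method names are hypothetical and do not reflect RainForest's actual interfaces.

```python
# Conceptual sketch of the "isolate first, then share" model. All names here
# are hypothetical illustrations, not RainForest's real API.
from dataclasses import dataclass, field

@dataclass
class SubOS:
    """An OS instance that exclusively owns a slice of physical resources."""
    sid: int
    cores: set[int]   # CPU cores owned outright (no shared kernel state)
    mem_mb: int       # physical memory owned outright
    channels: dict = field(default_factory=dict)  # on-demand inter-subOS links

class Supervisor:
    """Runs at the same privilege level as subOSes; handles only lifecycle."""
    def __init__(self, total_cores: int, total_mem_mb: int):
        self.free_cores = set(range(total_cores))
        self.free_mem_mb = total_mem_mb
        self.suboses: dict[int, SubOS] = {}
        self._next = 0

    def create(self, n_cores: int, mem_mb: int) -> SubOS:
        if n_cores > len(self.free_cores) or mem_mb > self.free_mem_mb:
            raise RuntimeError("insufficient free physical resources")
        cores = {self.free_cores.pop() for _ in range(n_cores)}
        self.free_mem_mb -= mem_mb
        sub = SubOS(self._next, cores, mem_mb)
        self.suboses[self._next] = sub
        self._next += 1
        return sub

    def resize(self, sub: SubOS, delta_cores: int, delta_mem_mb: int) -> None:
        """Grow (positive deltas) or shrink (negative) a running subOS."""
        if delta_cores > len(self.free_cores) or delta_mem_mb > self.free_mem_mb:
            raise RuntimeError("insufficient free physical resources")
        if delta_cores > 0:
            sub.cores |= {self.free_cores.pop() for _ in range(delta_cores)}
        else:
            for _ in range(-delta_cores):
                self.free_cores.add(sub.cores.pop())
        self.free_mem_mb -= delta_mem_mb
        sub.mem_mb += delta_mem_mb

    def destroy(self, sub: SubOS) -> None:
        """Return all of a subOS's resources to the free pool."""
        self.free_cores |= sub.cores
        self.free_mem_mb += sub.mem_mb
        del self.suboses[sub.sid]

# Example: carve two subOSes out of a 16-core, 64 GB machine, then grow one.
sup = Supervisor(total_cores=16, total_mem_mb=65536)
a = sup.create(n_cores=4, mem_mb=16384)
b = sup.create(n_cores=8, mem_mb=32768)
sup.resize(a, delta_cores=2, delta_mem_mb=8192)  # a now owns 6 cores, 24 GB
sup.destroy(b)                                   # b's resources return to the pool
```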

    AIBench Training: Balanced Industry-Standard AI Training Benchmarking

    Early-stage evaluations of a new AI architecture or system need affordable AI benchmarks, while relying on a few AI component benchmarks alone in later stages may lead to misleading conclusions. This paper proposes a balanced benchmarking methodology. Performing an exhaustive survey of Internet service AI domains, we identify and implement seventeen representative AI tasks with state-of-the-art models to guarantee the diversity and representativeness of the benchmarks. Meanwhile, we keep the benchmark subset to a minimum for affordability. Together with seventeen industry partners, we contribute by far the most comprehensive AI training benchmark suite. The evaluations show: (1) AIBench Training outperforms MLPerf Training in terms of the diversity and representativeness of model complexity, computational cost, convergence rate, computation and memory access patterns, and hotspot functions; (2) relative to the full AIBench benchmarks, the subset shortens benchmarking cost by 54% while maintaining the primary workload characteristics; (3) the performance ranking shows that a single-purpose AI accelerator like the TPU with the optimized TensorFlow framework outperforms GPUs, while giving up the latter's general support for a variety of AI models. The AIBench Training specifications, source code, testbed, and performance numbers are publicly available from the web site http://www.benchcouncil.org/AIBench/index.html
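    One plausible way to keep a benchmark subset minimal while preserving the primary workload characteristics, as the abstract describes, is to cluster benchmarks by their measured characteristics and keep one representative per cluster. The sketch below assumes such a clustering approach; the benchmark names and feature data are made up, and this is not AIBench's actual selection procedure.

```python
# Hedged sketch: pick a small representative benchmark subset by clustering
# per-benchmark workload-characteristic vectors and keeping the benchmark
# nearest each cluster centroid. All names and data here are placeholders.
import numpy as np
from sklearn.cluster import KMeans

benchmarks = ["image_cls", "object_det", "translation", "speech_rec",
              "recommendation", "text_to_text", "face_embed", "3d_recon"]
# Rows: normalized characteristic vectors (e.g., FLOPs, memory intensity,
# achieved IPC, epochs to converge). Random stand-ins for real measurements.
X = np.random.default_rng(0).normal(size=(len(benchmarks), 4))

k = 3  # target subset size
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
subset = []
for c in range(k):
    members = np.flatnonzero(km.labels_ == c)
    nearest = members[np.argmin(
        np.linalg.norm(X[members] - km.cluster_centers_[c], axis=1))]
    subset.append(benchmarks[nearest])
print("representative subset:", subset)
```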

    AIBench: An Industry Standard Internet Service AI Benchmark Suite

    Today's Internet services are undergoing fundamental changes and shifting to an intelligent computing era where AI is widely employed to augment services. In this context, many innovative AI algorithms, systems, and architectures are proposed, and thus the importance of benchmarking and evaluating them rises. However, modern Internet services adopt a microservice-based architecture and consist of various modules. The diversity of these modules, the complexity of execution paths, the massive scale and complex hierarchy of datacenter infrastructure, and the confidentiality issues around data sets and workloads pose great challenges to benchmarking. In this paper, we present the first industry-standard Internet service AI benchmark suite, AIBench, developed with seventeen industry partners, including several top Internet service providers. AIBench provides a highly extensible, configurable, and flexible benchmark framework that contains loosely coupled modules. From the three most important Internet service domains, search engine, social network, and e-commerce, we identify sixteen prominent AI problem domains, such as learning to rank, each of which forms an AI component benchmark; this is by far the most comprehensive AI benchmarking effort. On the basis of the AIBench framework, and abstracting real-world data sets and workloads from one of the top e-commerce providers, we design and implement the first end-to-end Internet service AI benchmark, which contains the primary modules on the critical paths of an industry-scale application and scales to deployment on clusters of different sizes. The specifications, source code, and performance numbers are publicly available from the benchmark council web site http://www.benchcouncil.org/AIBench/index.html.
    Comment: 24 pages
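    The "loosely coupled modules" wired together by configuration can be illustrated with a minimal registry-plus-pipeline pattern. The sketch below is a hypothetical illustration of that style of framework design; the module names and registry API are invented, not AIBench's.

```python
# Hedged sketch of a loosely coupled, configurable benchmark pipeline.
# Stages register themselves by name; a config list wires them together,
# so modules stay independent of one another. Names are illustrative only.
from typing import Callable

REGISTRY: dict[str, Callable] = {}

def module(name: str):
    """Register a benchmark stage so pipelines can be assembled from config."""
    def wrap(fn: Callable):
        REGISTRY[name] = fn
        return fn
    return wrap

@module("recommend")
def recommend(request: dict) -> dict:
    request["candidates"] = ["item1", "item2", "item3"]  # stand-in results
    return request

@module("rank")
def rank(request: dict) -> dict:
    request["ranked"] = sorted(request["candidates"])    # stand-in ranking
    return request

def run_pipeline(config: list[str], request: dict) -> dict:
    for stage in config:  # the config, not the modules, defines the path
        request = REGISTRY[stage](request)
    return request

print(run_pipeline(["recommend", "rank"], {"query": "shoes"}))
```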

    AIBench: An Agile Domain-specific Benchmarking Methodology and an AI Benchmark Suite

    Domain-specific software and hardware co-design is promising, as it is much easier to achieve efficiency for fewer tasks. Agile domain-specific benchmarking speeds up the process, as it provides not only relevant design inputs but also relevant metrics and tools. Unfortunately, modern workloads like big data, AI, and Internet services dwarf traditional ones in terms of code size, deployment scale, and execution paths, and hence raise serious benchmarking challenges. This paper proposes an agile domain-specific benchmarking methodology. Together with seventeen industry partners, we identify ten important end-to-end application scenarios, from which sixteen representative AI tasks are distilled as the AI component benchmarks. We propose permutations of essential AI and non-AI component benchmarks as end-to-end benchmarks. An end-to-end benchmark is a distillation of the essential attributes of an industry-scale application. We design and implement a highly extensible, configurable, and flexible benchmark framework, on the basis of which we propose guidelines for building end-to-end benchmarks and present the first end-to-end Internet service AI benchmark. The preliminary evaluation shows the value of our benchmark suite, AIBench, against MLPerf and TailBench for hardware and software designers, microarchitecture researchers, and code developers. The specifications, source code, testbed, and results are publicly available from the web site http://www.benchcouncil.org/AIBench/index.html.
    Comment: 25 pages, 7 figures. arXiv admin note: substantial text overlap with arXiv:1908.0899
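    Treating an end-to-end benchmark as one ordered composition of AI and non-AI component benchmarks, as proposed above, can be sketched with a simple enumeration. The component names below are placeholders, and the constraint chosen (at least one AI stage per pipeline) is an assumption for illustration, not part of the paper.

```python
# Hedged sketch: enumerate candidate end-to-end benchmarks as ordered
# permutations drawn from pools of AI and non-AI components. Placeholder names.
from itertools import permutations

ai = ["classify", "rank", "recommend"]
non_ai = ["parse_request", "cache_lookup", "render_response"]

def candidate_pipelines(ai_parts, non_ai_parts, length=3):
    """Yield ordered compositions that include at least one AI stage."""
    pool = ai_parts + non_ai_parts
    for combo in permutations(pool, length):
        if any(p in ai_parts for p in combo):  # assumed constraint
            yield combo

for pipe in list(candidate_pipelines(ai, non_ai))[:5]:
    print(" -> ".join(pipe))
```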